# Harmful Corpus for Research Purposes

**Warning:** This directory contains materials that include harmful or offensive language. These files are intended strictly for research purposes (e.g., safety evaluation and adversarial testing of language and vision-language models).

## File Descriptions

- `derogatory_corpus.csv`:  
  A small dataset sourced from a prior harmful-Continuation-based adversarial jailbreak study (VAE). It contains explicitly harmful text (e.g., misogynistic, racist, or misanthropic content) used to optimize harmful-Continuation attacks. This file is not authored by us, and it is included solely for replication and comparative benchmarking.

- `benign_sentences.csv`:  
  A curated set of safe, non-harmful sentences designed to serve as benign conditioning prompts in our Benign-to-Harmful (B2H) jailbreak experiments. Authored by the current researchers.

- `harmful_words.csv`:  
  A list of target harmful words used in the B2H setting, mapped to benign phrases to guide the adversarial image optimization process. Also created by the authors.

## Ethics & Usage

This data is provided in the interest of improving AI safety research and understanding the limitations of current safety alignment techniques in large vision-language models (LVLMs).

- All examples are used in a tightly controlled experimental setting.  
- Outputs are redacted or filtered as appropriate.  
- Any use of these materials should strictly adhere to ethical guidelines and institutional review policies.

**Disclaimer:** These datasets do not reflect the personal views or beliefs of the authors. They are solely intended to support reproducible, responsible research on safety vulnerabilities in AI systems.
